
[python][scikit-learn] Fixes a bug that prevented using multiple eval_metrics in LGBMClassifier #3222

Merged (4 commits into microsoft:master) on Jul 14, 2020

Conversation

@giresg (Contributor) commented Jul 12, 2020

This PR fixes a bug in the LGBMClassifier class. The code as-is breaks when eval_metric is a list of strings. This is meant to work (as referenced in the documentation) and does work on all derived classes of LGBMModel except LGBMClassifier. The issue is in the following code in the master branch:

if self._n_classes > 2:
    # Switch to using a multiclass objective in the underlying LGBM instance
    ova_aliases = {"multiclassova", "multiclass_ova", "ova", "ovr"}
    if self._objective not in ova_aliases and not callable(self._objective):
        self._objective = "multiclass"
    if eval_metric in {'logloss', 'binary_logloss'}:
        eval_metric = "multi_logloss"
    elif eval_metric in {'error', 'binary_error'}:
        eval_metric = "multi_error"
else:
    if eval_metric in {'logloss', 'multi_logloss'}:
        eval_metric = 'binary_logloss'
    elif eval_metric in {'error', 'multi_error'}:
        eval_metric = 'binary_error'
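The failure mode is easy to reproduce in plain Python: testing membership in a `set` hashes the left-hand operand, and a `list` is unhashable, so `eval_metric in {...}` raises a `TypeError` instead of simply evaluating to `False` (a minimal standalone illustration, not LightGBM code):

```python
# `in` on a set hashes the left-hand operand; a list is unhashable,
# so the alias check above crashes when eval_metric is a list.
eval_metric = ['fair', 'error']
try:
    eval_metric in {'logloss', 'binary_logloss'}
except TypeError as err:
    print(type(err).__name__)  # TypeError
```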

With the fix, the following line works as expected (see the tests in this PR):

gbm = lgb.LGBMClassifier(**params).fit(eval_metric=['fair', 'error'], **params_fit)
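The shape of the fix can be sketched as follows. This is a hypothetical helper (`remap_eval_metric` is not the name used in the PR, and the actual diff may differ): normalize `eval_metric` to a list, remap each string alias through a lookup table, and restore the original scalar/list shape, so callables and unknown metric names pass through untouched.

```python
# Hypothetical sketch of per-element alias remapping; not the exact PR code.
_MULTI_ALIASES = {'logloss': 'multi_logloss', 'binary_logloss': 'multi_logloss',
                  'error': 'multi_error', 'binary_error': 'multi_error'}
_BINARY_ALIASES = {'logloss': 'binary_logloss', 'multi_logloss': 'binary_logloss',
                   'error': 'binary_error', 'multi_error': 'binary_error'}

def remap_eval_metric(eval_metric, n_classes):
    """Remap metric aliases to names matching the (binary/multiclass) objective.

    Accepts a single string, a callable, None, or a list of strings/callables,
    mirroring what LGBMModel.fit() allows for eval_metric.
    """
    if eval_metric is None:
        return None
    table = _MULTI_ALIASES if n_classes > 2 else _BINARY_ALIASES
    # Wrap scalars so strings, callables, and lists share one code path.
    single = not isinstance(eval_metric, list)
    metrics = [eval_metric] if single else eval_metric
    remapped = [table.get(m, m) if isinstance(m, str) else m for m in metrics]
    return remapped[0] if single else remapped

print(remap_eval_metric(['fair', 'error'], n_classes=3))  # ['fair', 'multi_error']
print(remap_eval_metric('logloss', n_classes=2))          # binary_logloss
```

With per-element remapping, a mixed list such as `['fair', 'error']` no longer crashes the membership check, and only the aliased entries are rewritten.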

@StrikerRUS (Collaborator) left a comment

@gramirezespinoza LGTM! Thank you very much for the fix! Just one minor comment below.

@@ -922,3 +922,14 @@ def test_continue_training_with_model(self):
        self.assertEqual(len(init_gbm.evals_result_['valid_0']['multi_logloss']), 5)
        self.assertLess(gbm.evals_result_['valid_0']['multi_logloss'][-1],
                        init_gbm.evals_result_['valid_0']['multi_logloss'][-1])

    def test_eval_metrics_lgbmclassifier(self):
@StrikerRUS (Collaborator) commented Jul 12, 2020

Will it make sense to include the content of this test into the already existing huge unit test for metrics, as an enhancement of the case `... invalid multiclass metric is replaced with binary alternative ...`?

def test_metrics(self):

    # invalid multiclass metric is replaced with binary alternative for custom objective
    gbm = lgb.LGBMClassifier(objective=custom_dummy_obj,
                             **params).fit(eval_metric='multi_logloss', **params_fit)
    self.assertEqual(len(gbm.evals_result_['training']), 1)
    self.assertIn('binary_logloss', gbm.evals_result_['training'])

Or just please move this new test closer to that one because they are both about the same things.

@giresg (Contributor, PR author) commented

@StrikerRUS Yes, that definitely makes sense. I've moved the test inside test_metrics but placed it next to the section

non-default metric with multiple metrics in eval_metric

as it is highly related to that test. Thanks!

X_classification, y_classification = load_breast_cancer(True)
params_classification = {'n_estimators': 2, 'verbose': -1, 'objective': 'binary', 'metric': 'binary_logloss'}
params_fit_classification = {'X': X_classification, 'y': y_classification,
                             'eval_set': (X_classification, y_classification), 'verbose': False}
@StrikerRUS (Collaborator) commented

Please fix one linting error:

./tests/python_package_test/test_sklearn.py:547:23: E128 continuation line under-indented for visual indent
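For reference, E128 is flake8's check for a continuation line that is not aligned with the token after the opening delimiter it visually hangs from. A minimal before/after illustration (hypothetical data, not the PR's actual lines):

```python
# Hypothetical data, for illustration only.
X, y = [1, 2], [0, 1]

# Flagged by flake8 as E128: the continuation line is under-indented
# relative to the visual indent set by the opening brace.
params_fit = {'X': X,
     'y': y}

# Fixed: continuation aligned with the first element inside the brace.
params_fit = {'X': X,
              'y': y}
```

Both versions are valid Python; E128 is purely a style check, which is why the PR only needed a whitespace change.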

@giresg (Contributor, PR author) commented

Fixed!

@StrikerRUS (Collaborator) commented

Thank you! Sorry for the inconvenience, but could you please rebase onto the latest master? I've just fixed the R-package builds in #3225; those failures prevent merging.

@giresg (Contributor, PR author) commented

Rebased, hopefully I did it correctly as it is my first time rebasing an upstream branch 😄

@StrikerRUS (Collaborator) commented

Thanks!
But it seems that something went wrong: see the number of changed files.

[screenshot: changed-files count]

Maybe you can allow editing from maintainers and then I'll be able to help with cleaning up the rebasing?
https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/allowing-changes-to-a-pull-request-branch-created-from-a-fork

@giresg (Contributor, PR author) commented

[screenshot: repository settings, Jul 14, 2020]

Editing for maintainers is enabled. Sorry for this.

@StrikerRUS (Collaborator) commented

No problem! It seems I was able to exclude the excess files.

@StrikerRUS force-pushed the fix-lgbmclassifier-eval-metric branch from 0c53735 to 5bcc9ec on July 14, 2020 15:56
@StrikerRUS merged commit 7b8b515 into microsoft:master on Jul 14, 2020
@github-actions (bot) commented
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions bot locked as resolved and limited conversation to collaborators on Aug 24, 2023